Genome measures used for quality control are dependent on gene function and ancestry

Authors

  • Jing Wang
  • Leon Raskin
  • David C. Samuels
  • Yu Shyr
  • Yan Guo
Abstract

MOTIVATION: The transition/transversion (Ti/Tv) ratio and heterozygous/nonreference-homozygous (het/nonref-hom) ratio have been commonly computed in genetic studies as a quality control (QC) measurement. Additionally, these two ratios are helpful in our understanding of the patterns of DNA sequence evolution.

RESULTS: To thoroughly understand these two genomic measures, we performed a study using 1000 Genomes Project (1000G) released genotype data (N=1092). An additional two datasets (N=581 and N=6) were used to validate our findings from the 1000G dataset. We compared the two ratios among continental ancestry, genome regions and gene functionality. We found that the Ti/Tv ratio can be used as a quality indicator for single nucleotide polymorphisms inferred from high-throughput sequencing data. The Ti/Tv ratio varies greatly by genome region and functionality, but not by ancestry. The het/nonref-hom ratio varies greatly by ancestry, but not by genome regions and functionality. Furthermore, extreme guanine + cytosine content (either high or low) is negatively associated with the Ti/Tv ratio magnitude. Thus, when performing QC assessment using these two measures, care must be taken to apply the correct thresholds based on ancestry and genome region. Failure to take these considerations into account at the QC stage will bias any following analysis.

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
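
As a rough illustration of the two measures named in the abstract, the sketch below computes a Ti/Tv ratio from a list of single-nucleotide substitutions and a het/nonref-hom ratio from diploid genotype calls. It is not the authors' pipeline: the function names, the (ref, alt) input format, the VCF-style genotype strings, and the choice to count any call with two different alleles as heterozygous are all assumptions made for this example.

```python
# Illustrative sketch only; names and input formats are assumptions,
# not taken from the paper.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps):
    """Ti/Tv for an iterable of (ref, alt) single-base substitutions."""
    ti = tv = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            continue  # identical bases are not a substitution
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    # Undefined when there are no transversions.
    return ti / tv if tv else float("nan")


def het_nonref_hom_ratio(genotypes):
    """het/nonref-hom for an iterable of genotype strings such as '0/1'."""
    het = hom_alt = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        first, second = alleles
        if first != second:
            het += 1  # assumption: any two different alleles count as het
        elif first != "0":
            hom_alt += 1  # homozygous for a non-reference allele
    # Undefined when there are no nonreference-homozygous calls.
    return het / hom_alt if hom_alt else float("nan")


example_snps = [("A", "G"), ("C", "T"), ("G", "A"),
                ("A", "C"), ("T", "C"), ("G", "T")]
example_genotypes = ["0/1", "1/1", "0|1", "0/0", "./.", "1/1", "1/0"]

print(ti_tv_ratio(example_snps))                # 4 transitions / 2 transversions
print(het_nonref_hom_ratio(example_genotypes))  # 3 het / 2 nonref-hom
```

In line with the abstract's conclusion, a QC threshold applied to either value would need to be chosen per genome region (for Ti/Tv) and per ancestry (for het/nonref-hom); the sketch deliberately hard-codes no thresholds.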

Similar resources

Applications of multiplex ligation-dependent probe amplification (MLPA) method in diagnosis of cancer and genetic disorders

Introduction: Many human diseases and syndromes result from partial or complete gene deletions and duplications or changes of certain specific chromosomal sequences. Various methods are used to study chromosomal aberrations, including Comparative Genomic Hybridization (CGH), Fluorescent in Situ Hybridization (FISH), Southern blots, Multiplex Amplifiable Probe Hybridisation (MAP...

Molecular Comparison of Three Different Regions of the Genome of Infectious Bronchitis Virus Field Isolates and Vaccine Strains

Rapid detection and differentiation of infectious bronchitis virus (IBV) involved in the disease outbreak is very important for controlling disease and developing new vaccines. In the present study, three regions of the genome of IBV vaccine and field isolates including S1 gene, gene 3 and nucleocapsid (N) gene along with 3' untranslated region (3' UTR) were amplified and subjected to restricti...

Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data

Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data is now widespread, these data are not ubiquitous as ...

Integrated production-Inventory model with price-dependent demand, imperfect quality, and investment in quality and inspection

In practice, manufacturing systems are never perfect and may have low quality outputs. Therefore, different decisions such as reprocessing, sale at lower prices or diminishing are made according to industry and market. This paper investigates the importance of supply chain coordination through developing two models in centralized decision-making for an imperfect quality manufacturing system wit...

Introducing a New SYBR green Real-time PCR for Detection of SARS-CoV2 Virus Genome

Background and purpose: There are various methods for molecular detection of SARS-CoV2 genome among which, PCR-based methods are the most reliable for making diagnosis. The majority of approved PCR kits for detection of Coronavirus are based on TaqMan real-time PCR which is expensive due to incorporating fluorescent and quencher-harboring probe. The aim of this study was to design a simple and ...

Journal:
  • Bioinformatics

Volume 31, Issue 3

Pages: -

Publication date: 2015